[MLlib] org.apache.spark.mllib.util.SVMDataGenerator generates ArrayIndexOutOfBoundsException. I have found the bug and tested the solution. by j4munoz · Pull Request #13895 · apache/spark

j4munoz · 2016-06-24T19:02:48Z

What changes were proposed in this pull request?

Adjust the size of the array created at line 58 so that line 66 no longer throws an ArrayIndexOutOfBoundsException.

How was this patch tested?

Manual tests. I recompiled the entire project with the fix; it built successfully, and running the code gave good results.

line 66: val yD = blas.ddot(trueWeights.length, x, 1, trueWeights, 1) + rnd.nextGaussian() * 0.1
crashes because trueWeights has length "nfeatures + 1" while "x" has length "nfeatures". The call asks ddot to read trueWeights.length elements from both arrays, so it runs past the end of x. The two arrays should have the same length.

To fix this, make trueWeights the same length as x.
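The length mismatch can be reproduced without Spark. The following is a minimal sketch, not the code from SVMDataGenerator: the object name `DotLengthSketch` and the helper `dot` are hypothetical, and `dot` is a plain-Scala stand-in that only imitates how `blas.ddot(n, x, 1, y, 1)` reads `n` elements from each array.

```scala
import scala.util.Random

object DotLengthSketch {
  // Hypothetical stand-in for blas.ddot(n, x, 1, y, 1):
  // reads the first n elements of both arrays with stride 1.
  def dot(n: Int, x: Array[Double], y: Array[Double]): Double = {
    var sum = 0.0
    var i = 0
    while (i < n) {
      sum += x(i) * y(i)
      i += 1
    }
    sum
  }

  def main(args: Array[String]): Unit = {
    val rnd = new Random(42)
    val nfeatures = 2

    // One sample with nfeatures values, each in [-1, 1).
    val x = Array.fill[Double](nfeatures)(rnd.nextDouble() * 2.0 - 1.0)

    // Before the fix: the weight vector has one element more than x,
    // so asking for oversized.length elements reads past the end of x.
    val oversized = Array.fill[Double](nfeatures + 1)(rnd.nextGaussian())
    val threw =
      try { dot(oversized.length, x, oversized); false }
      catch { case _: ArrayIndexOutOfBoundsException => true }
    println(s"oversized weights threw: $threw")

    // After the fix: weights and sample have the same length.
    val trueWeights = Array.fill[Double](nfeatures)(rnd.nextGaussian())
    val yD = dot(trueWeights.length, x, trueWeights) + rnd.nextGaussian() * 0.1
    println(s"yD = $yD")
  }
}
```

With equal lengths, passing `trueWeights.length` as the element count is safe for both arrays, which is why resizing the weight vector is enough and line 66 itself does not need to change.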

I have recompiled the project with the change and it is working now:
[spark-1.6.1]$ spark-submit --master local[*] --class org.apache.spark.mllib.util.SVMDataGenerator mllib/target/spark-mllib_2.11-1.6.1.jar local /home/user/test

The data is now generated successfully in the specified folder.

srowen · 2016-06-24T19:07:56Z

Jenkins test this please

SparkQA · 2016-06-24T19:57:39Z

Test build #61193 has finished for PR 13895 at commit a5ebe40.

This patch passes all tests.
This patch merges cleanly.
This patch adds no public classes.

srowen · 2016-06-25T08:10:51Z

Merging to master and 2.0 / 1.6 as a clean bug fix. I also tested it.

…ndexOutOfBoundsException. I have found the bug and tested the solution.

Author: José Antonio <joseanmunoz@gmail.com>
Closes #13895 from j4munoz/patch-2.
(cherry picked from commit a3c7b41)
Signed-off-by: Sean Owen <sowen@cloudera.com>

…ndexOutOfBoundsException. I have found the bug and tested the solution.

Author: José Antonio <joseanmunoz@gmail.com>
Closes apache#13895 from j4munoz/patch-2.
(cherry picked from commit a3c7b41)
Signed-off-by: Sean Owen <sowen@cloudera.com>
(cherry picked from commit 24d59fb)

j4munoz mentioned this pull request Jun 24, 2016

[SPARK][MLlib] ArrayIndexOutOfBoundsException fixed #13849

Closed

asfgit closed this in a3c7b41 Jun 25, 2016

j4munoz wants to merge 1 commit into apache:master from j4munoz:patch-2
